Description

FIELD OF THE INVENTION

This invention relates to a biologically pure DNA signal sequence which encodes an amino acid signal
peptide necessary for directing the secretion from certain defined hosts of proteins in bioactive form.

BACKGROUND OF THE INVENTION

In the biological production of commercially viable proteins by the fermentation of microorganisms, the
ability to produce the desired proteins by fermentation with secretion of the proteins by the microorganisms
into the broth is very significant. However, there are many commercially viable proteins encoded by
genetically engineered DNA constructs which are not secreted by the cells in which the DNA is expressed.
This often necessitates harvesting the cells, bursting the cell walls, recovering the desired proteins in pure
form and then chemically re-naturing the pure material to restore its bioactive function. This downstream
processing, as it is called, is illustrated in Figure 1.

Some cells and microorganisms carry out the biological equivalent of downstream processing by
secreting proteins in bioactive form. The mechanism which directs the secretion of some proteins through
the cell walls is not fully understood. For example, in Streptomyces griseus, an organism used for the
commercial production of Pronase, the species secretes many extracellular proteins (Jurasek, L., P.
Johnson, R.W. Olafson, and L.B. Smillie (1971), An improved fractionation system for pronase on CM-
Sephadex, Can. J. Biochem., 49:1195-1201). Protease A and protease B, two of the serine proteases
secreted by S. griseus, have sequences which are 61% homologous on the basis of amino acid identity
(Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease
at 1.7 Å resolution; Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502; Jurasek,
L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and L.H. Ericsson (1974), Amino acid sequencing of
Streptomyces griseus protease B, A major component of pronase, Biochem. Biophys. Res. Comm.,
61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and M.O. Dayhoff (1978), Serine proteases, In
M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5, suppl. 3:73-83). These proteases also have
similar tertiary structure, as determined by X-ray crystallography (Delbaere, L.T.J., W.L.B. Hutcheon, M.N.G.
James, and W.E. Thiessen (1975), Tertiary structural differences between microbial serine proteases and
pancreatic serine enzymes, Nature 257:758-763; Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G.
James (1985), Refined structure of α-lytic protease at 1.7 Å resolution; Analysis of hydrogen bonding and
solvent structure, J. Mol. Biol., 183:479-502; James, M.N.G., A.R. Sielecki, G.D. Brayer, L.T.J. Delbaere, and
C.-A. Bauer (1980), Structures of product and inhibitor complexes of Streptomyces griseus protease A at
1.8 Å resolution, J. Mol. Biol., 144:43-88). Although the structures of proteases A and B have been
extensively studied, the genes encoding these proteases have not been characterized before. EP-A-0 222
279 discloses signal peptides derived from Streptomyces.

SUMMARY OF THE INVENTION



In accordance with this invention, the genes encoding protease A and protease B of S. griseus have
been isolated and investigated to reveal DNA sequences which each direct the secretion of an encoded
protein fused either directly or indirectly to a signal peptide encoded by the DNA.

According to an aspect of the invention, a recombinant DNA sequence comprises a signal sequence
and a gene sequence encoding a protein. The recombinant DNA sequence, when expressed in a living cell,
encodes an amino acid signal peptide with the protein. The signal peptide directs secretion of the protein
from a cell within which the DNA signal sequence is expressed.

According to another aspect of the invention, a biologically pure isolated DNA signal sequence encodes
a 38 amino acid signal peptide which directs secretion of a recombinant gene-sourced protein linked to
such 38 amino acid signal peptide, from a cell in which the DNA signal sequence is expressed. The DNA
signal sequence is isolated from Streptomyces griseus.

According to another aspect of the invention, the DNA signal sequence in conjunction with a gene
sequence encoding a protein is inserted into a vector, such as a plasmid or a phage.

According to another aspect of the invention, the DNA signal sequence is adapted for expression in a
living cell having enzymes catalyzing the formation of disulphide bonds.

According to another aspect of the invention, there is provided the biologically pure isolated DNA signal
sequence of Figure 4a.

According to another aspect of the invention, there is provided the biologically pure isolated DNA signal
sequence of Figure 5a.

According to another aspect of the invention, a fused protein is encoded by the recombinant DNA
sequence of Figure 4 or Figure 5.

According to another aspect of the invention, a transformed prokaryotic cell is provided which has
inserted therein a suitable vector including the recombinant DNA encoding the signal protein. The
transformed prokaryotic cell may be selected from the Streptomyces genera.

According to another aspect of the invention, a biologically pure culture has a transformed prokaryotic
cell with the recombinant DNA sequence in a suitable vector. The culture is capable of producing, as an
intermediate, the fused protein of the amino acid signal peptide and the protein. The protein itself is
produced in a recoverable quantity upon fermentation of the transformed cell in an aqueous nutrient
medium. The signal peptide directs secretion of the protein from the cell.

According to another aspect of the invention, a biologically pure culture, transformed with the functional
signal sequence as described above, is able to direct the secretion from the cell of proteins whose
bioactivity is dependent upon the formation of correctly positioned intramolecular disulphide bonds.

A biologically pure DNA sequence encoding a fused protein including protease A has the combined
DNA sequence of Figures 4a, 4b and 4c.

A biologically pure DNA sequence encoding a fused protein including protease B has the combined
DNA sequence of Figures 5a, 5b and 5c.


BRIEF DESCRIPTION OF THE DRAWINGS 

With reference to the Figures, a variety of short forms have been used to identify restriction
endonucleases, amino acids, deoxyribonucleic acids and related information. Standard nomenclature has
been used in identifying all of these components as are readily appreciated by those skilled in the art.
Preferred embodiments of the invention are described with respect to the drawings, wherein:
Figure 1 illustrates downstream processing;
Figure 2 shows restriction endonuclease maps of DNA fragments of sprA and sprB;
Figure 3 illustrates restriction endonuclease maps and sequencing strategies in sequencing DNA
fragments containing sprA and sprB;
Figure 4 is the DNA sequence of sprA;
Figure 4a is the DNA sequence encoding the sprA (protease A) signal peptide;
Figure 4b is the DNA sequence encoding the sprA (protease A) propeptide;
Figure 4c is the DNA sequence encoding mature protease A;
Figure 5 is the DNA sequence of sprB;
Figure 5a is the DNA sequence encoding the sprB (protease B) signal peptide;
Figure 5b is the DNA sequence encoding the sprB (protease B) propeptide;
Figure 5c is the DNA sequence encoding mature protease B;
Figure 6 is an alignment of the amino acid sequences deduced from sprA and sprB to develop homology
between the two sequences.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The organism Streptomyces griseus is a well recognised microorganism. It is commercially used for the
production of the enzyme Pronase. It is appreciated, however, that this organism also secretes two
enzymes, protease A and protease B, which are both serine proteases. Although the structures of proteases
A and B have been extensively studied, the genes encoding these proteins, and the manner in which this
genetic information is used to signal secretion by the cells, are not understood. According to this invention,
the genes which encode protease A and protease B and provide for the secretion of these proteins in
bioactive form have been discovered. It has been determined that each of protease A and B is included in a
precursor protein which is processed to remove an amino-terminal polypeptide portion from the mature
protease. It has further been determined that each of protease A and B precursor proteins is enzymatically
processed to form correctly-positioned intramolecular disulphide bonds, which processing is concomitant
with removal of the amino terminal addressing peptide from the mature precursor. The discovered genes,
which encode proteases A and B, their intermediate address-competent forms, and their control elements,
have been designated sprA and sprB.

EP 0 300 466 B1

As discussed in the following articles, Jurasek, L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and
L.H. Ericsson (1974), Amino acid sequencing of Streptomyces griseus protease B, A major component of
pronase, Biochem. Biophys. Res. Comm., 61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and
M.O. Dayhoff (1978), Serine proteases, In M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5,
suppl. 3:73-83, proteases A and B are homologous proteins containing several segments of identical amino
acid sequence. In accordance with this invention, the genetic code, which makes and directs the secretion
of each of proteases A and B, has identical DNA sequences corresponding to the regions of identicality for
the homologous proteins proteases A and B. In order to isolate the genes, this assumption, that identicality
in portions of the gene sequences would occur, was made so that an oligonucleotide probe could be
designed from one of the similar regions in the sequences.

In order to extrapolate the gene sequence which would encode the similar amino acid sequence, the
known codon bias for Streptomyces was relied upon to develop the nucleotide probe (see Bernan, V., D.
Filpula, W. Herber, M. Bibb, and E. Katz (1985), The nucleotide sequence of the tyrosinase gene from
Streptomyces antibioticus and characterization of the gene product, Gene 37:101-110; Bibb, M.J., M.J. Bibb,
J.M. Ward, and S.N. Cohen (1985), Nucleotide sequences encoding and promoting expression of three
antibiotic resistance genes indigenous to Streptomyces, Mol. Gen. Genet., 199:26-36; Thompson, C.J., and
G.S. Gray (1983), Nucleotide sequence of a streptomycete aminoglycoside phosphotransferase gene and its
relationship to phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA, 80:5190-
5194). Once the probe was constructed, it was then possible to probe the DNA sequences of S. griseus to
determine if there were any corresponding nucleic acid sequences in the microorganism. Since it was
known that there were two proteases, A and B, the oligonucleotide probe should have revealed two DNA
fragments detected by hybridisation analysis, and in fact, not only did the probe hybridize equally to two
fragments generated in the genomic library of S. griseus, but also two fragments generated by BamHI
digest (8.4 kb and 6.8 kb) or BglII (11 kb and 2.8 kb) were isolated from the genomic library. As a cross-
check with respect to the predictability of such probe, the same fragments were detected in genomic DNA
libraries of other isolates of S. griseus. It was noted, however, that there was no such hybridization of the
oligonucleotide probe with DNA from other Streptomyces such as S. lividans.
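
By way of illustration only, the reasoning of the preceding paragraph (choosing one codon per residue according to a codon bias, then deriving an oligonucleotide from the result) may be sketched as follows. The peptide segment and the codon table are assumptions introduced for this sketch: they reflect only the general tendency of GC-rich genes to place G or C in the third codon position, and they are not the segment, the codon choices, or the probe actually used.

```python
# Illustrative only: these codon choices are assumptions reflecting a general
# preference for G or C in the third codon position; they are not the codon
# table used to design the actual probe.
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "G": "GGC",
    "S": "TCC", "T": "ACC", "V": "GTC", "P": "CCG",
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def back_translate(peptide):
    """Return one guessed coding sequence for a peptide (one codon per residue)."""
    return "".join(PREFERRED_CODON[residue] for residue in peptide)


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def gc_fraction(dna):
    """Fraction of G and C bases in a DNA string."""
    return sum(base in "GC" for base in dna) / len(dna)


# Hypothetical conserved segment shared by two homologous proteins.
segment = "GDSGG"
coding = back_translate(segment)      # guessed coding strand
probe = reverse_complement(coding)    # strand complementary to the guess
```

A probe built this way is only as good as the codon guesses, which is why the description treats hybridization to the expected number of genomic fragments as a check on the probe.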

Plasmids were constructed containing digested fragments of S. griseus. The oligonucleotide probe was
used to isolate developed plasmids containing sprA and sprB. The screening by use of the probe was
accomplished by colony blot hybridisation where approximately 16,000 E. coli transformants containing the
developed plasmids were screened. Twelve transformants were detected by the probe and isolated for
further characterisation. These colonies contained two distinct classes of plasmid based on restriction
analysis. As determined from the hybridization of genomic DNA, the plasmids contained either the 6.8 kb or
the 8.4 kb BamHI fragment. These fragments contained the sprA and sprB genes.

The fragments as isolated by hybridisation screening were tested for the expression of proteolytic
activity. With these plasmids identified, such characterisation may be accomplished in accordance with a
variety of known techniques in accordance with a preferred embodiment of this invention.

The 6.8 kb and 8.4 kb BamHI fragments were ligated into the BglII site of the vector pIJ702.
Transformants of S. lividans containing these constructions were tested on a milk plate for secretion of
proteases. A clear zone, which represented the degradation of the milk proteins, surrounded each
transformant that contained either BamHI fragment. It was noted that the clear zones were not found around
S. lividans colonies which contained either pIJ702 only or no plasmid construct.

Proteolytic activity was also observed when the BamHI fragments were cloned in either orientation with
respect to the vector, thereby minimising the possibility of read-through transcription of an incomplete
protease gene. This observation provides evidence that the two BamHI fragments contain an intact protease
gene which is capable of effecting secretion in a different Streptomyces species, as for example the S.
lividans. With this particularly relevant characterisation of the BamHI fragment, and knowing that the desired
gene was in these fragments, it was possible to isolate and to sequence the genes encoding protease A
and protease B.

According to a preferred aspect of this invention, the particular protease gene contained within each
cloned BamHI fragment was determined by dideoxy sequencing of the plasmids using the oligonucleotide
probe as a primer in such analysis. The 8.4 kb BamHI fragment was found to contain sprB, because a
polypeptide deduced from the DNA sequence matched a unique segment of the known amino acid
sequence of protease B. The 6.8 kb BamHI fragment contained the sprA by process of elimination. The
protease genes in these fragments were localised by digesting the plasmids and determining which of the
restriction fragments of the plasmids were capable of hybridising to the oligonucleotide probe.

Figure 2 shows detailed restriction maps of the 6.8 kb and 8.4 kb BamHI fragments. Hybridization to the
oligonucleotide probe was confined to a 0.3 kb PvuII-StuI fragment of sprA and a 0.8 kb PvuII-PvuI fragment
of sprB. Such hybridisation is indicated by the heavy lines in Figure 2. Hybridisation to the cloned BamHI
fragments and the 2.8 kb BglII fragment of sprB agrees with the hybridisation to BamHI and BglII fragments
of genomic DNA. Thus, rearrangement of the BamHI fragments containing the protease genes is unlikely to
have occurred.

The functional portions of the sprA- and sprB-containing DNA were determined by subcloning restriction
fragments thereof into pIJ702. The constructed plasmids were transformed into S. lividans and tested for
proteolytic activity. The 3.2 kb BamHI-BglII fragment of sprA and the 2.8 kb BglII fragment of sprB, when
subcloned into pIJ702 in either orientation, resulted in the secretion of protease from S. lividans. The intact
protease genes were further delimited to a 1.9 kb StuI fragment for sprA and a 1.4 kb BssHII fragment for
sprB. With reference to Figure 2, each of these functionally active subclones is indicated below the
restriction maps which contain the region for each gene which hybridized to the oligonucleotide probe.

In order to determine the nucleic acid sequences of the protease genes, the 3.2 kb BamHI-BglII fragment
of sprA and the 2.8 kb BglII fragment of sprB were subcloned into pUC18 to facilitate further structural
characterization. Figure 3 shows the restriction maps of these subclones and the strategies which
were used to sequence the 1.4 kb SalI fragment containing sprA and the 1.4 kb BssHII fragment containing
sprB. The resultant DNA sequences of sprA and sprB are shown in Figures 4 and 5,
respectively. The predicted amino acid sequence of protease A differed from the published sequence by
the amidation of amino acid 133, whereas that of protease B was identical to the published sequence (see
Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease
at 1.7 Å resolution; Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502).

Analysing the sequences of Figures 4 and 5, it is apparent that each sequence contains a large open
reading frame with the coding region of the mature protease situated at the 3' end. For the protease A and
protease B genes, the sequence encoding the carboxy-terminus of the protease is followed immediately by
a translation stop codon. At the other end of the sequence, the predicted amino acid sequences appear to
extend beyond the amino-termini of the mature proteases A and B by an additional 116 amino acids for
sprA of Figure 4 and 114 amino acids for sprB of Figure 5. The putative GTG initiation codons at each of
these positions (-116 for Figure 4; -114 for Figure 5) are each preceded by a potential ribosome binding site
(as indicated by the series of five dots above the sequence) and followed by a sequence which encodes a
signal peptide. The processing site for the signal peptidase (identified by the light arrow in Figures 4 and 5)
is predicted at 38 amino acids from the amino-terminus of the putative precursor. [For clarity, that part of
the nucleic acid sequences of Figures 4 and 5 corresponding to the signal peptide portion of sprA and sprB
is reproduced in Figures 4a and 5a, respectively.] The propeptide is encoded by the remaining sequence
between the signal processing site (light arrow) and the start of the mature protein (indicated at the dark
arrow). [For clarity, that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the propeptide
portion of sprA and sprB is reproduced in Figures 4b and 5b, respectively.] The mature protease is
encoded by the codon sequence 1 through 181 for Figure 4 and 1 through 185 for Figure 5. [For clarity,
that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the mature protein portion of
sprA and sprB is reproduced in Figures 4c and 5c, respectively.] The amino acid sequence for codons
-116 through +181 of Figure 4 and the amino acid sequence for codons -114 through +185 of Figure 5,
when made in the living cell S. griseus, are acted upon in a manner to produce in the culture medium
externally of the living cells the mature bioactive enzymes protease A and protease B. The processing
involved in accordance with the contained information encoded by that portion of the gene from start of the
promoter to start of the mature protein in each case included providing a secretory address, the correct
signal peptide processing site, the necessary propeptide structure not only for secretion but also for correct
disulphide bond formation concomitant with secretion, and competent secretion in bioactive form.
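
By way of illustration only, the codon numbering described above for sprB (114 codons upstream of the mature protein, a 38 amino acid signal peptide, and mature codons 1 through 185) may be laid out as in the following sketch. The class and its method names are hypothetical, and the sketch assumes that the mature protein begins at codon +1, that upstream codons are numbered negatively, and that there is no codon 0.

```python
from dataclasses import dataclass


@dataclass
class PrecursorLayout:
    """Codon ranges of a signal peptide / propeptide / mature protein precursor.

    Assumed numbering: mature protein starts at +1, upstream codons are
    negative, and no codon is numbered 0.
    """
    upstream_codons: int  # codons before the mature protein (signal + propeptide)
    signal_codons: int    # codons in the signal peptide
    mature_codons: int    # codons in the mature protein

    def signal_range(self):
        start = -self.upstream_codons
        return (start, start + self.signal_codons - 1)

    def propeptide_range(self):
        return (-self.upstream_codons + self.signal_codons, -1)

    def mature_range(self):
        return (1, self.mature_codons)

    def propeptide_codons(self):
        return self.upstream_codons - self.signal_codons

    def precursor_codons(self):
        return self.upstream_codons + self.mature_codons


# Values stated in the description for sprB (Figure 5).
sprB = PrecursorLayout(upstream_codons=114, signal_codons=38, mature_codons=185)
```

Under these assumptions the sprB signal peptide occupies codons -114 through -77, the propeptide occupies the 76 codons from -76 through -1, and the whole precursor spans 299 codons.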

In accordance with this invention, the ability of the signal peptide to direct the secretion of bioactive
protein was established by inserting known DNA sequences at the beginning and at the end of known
sequences. For example, consider the sequence shown in Figure 5. In particular, the promoter and initiator
ATG of the aminoglycoside phosphotransferase gene (Thompson, C.J., and G.S. Gray (1983), Nucleotide
sequence of a streptomycete aminoglycoside phosphotransferase gene and its relationship to
phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA, 80:5190-5194) had been inserted
preceding the second codon (AGG at -113) of the signal sequence of Figure 5. Due to the insertion of this
new promoter and initiator, the sprB gene, now under the control of this non-native promoter, directed both
elevated levels and earlier expression of proteolytic activity when compared with the unaltered sprB gene.
The secretion of bioactive protease B in this construction indicated that nucleic acid sequences preceding
the GTG initiation codon at -114 are not required for the correct secretion of the protease B in bioactive
form, provided an active and competent promoter is placed in the precise location indicated.

In order further to demonstrate the universality of the discovered signal peptide, the sprB coding region
was replaced with a gene sequence encoding the mature amylase from S. griseus. Hence the nucleic acid
sequence encoding the amylase was inserted in place of the sequence of Figure 5 to the right of the light
arrow. It was determined that the resulting genetic construction directed the production of an extracellular
protein having an N-terminal alanine, properly positioned intramolecular disulphide bonds, and exhibiting
amylolytic activity at a level comparable to that of a similar construction with the natural signal peptide of
amylase. In accordance with this invention, the 38 amino acid signal peptide of Figures 4 and 4a and 5 and
5a is sufficient to direct the secretion of non-native protein in bioactive form.

Since both signal sequences encode for the signal peptides of Figures 4 and 4a and 5 and 5a, the
organisation of the coding regions of sprA and sprB was investigated by comparing the amino acid
homology of the encoded peptide sequences. Such comparisons are set out in Figure 6 where amino acid
homology has been compared for the signal peptide of Figure 6a, the propeptide of Figure 6b and the
mature protease of Figure 6c. A summary of such homology is provided in the following Table I.



TABLE I

Homology of sprA and sprB Coding Regions

DNA Homology %

a amino-termini of mature proteases (amino acids 1-87)
b carboxy-termini of mature proteases (amino acids 88-190)



The alignment of amino acid sequences translated from the coding regions of the sprA and sprB genes
indicates an overall homology of 54% on the basis of amino acid identity. As indicated in Table I, the
sequence homology is not uniformly distributed throughout the coding region of the sprA and sprB genes.
The carboxy-terminal domains of the proteases A and B are 75% homologous as noted under the heading
"CT protease" whereas the average homology for the remainder of the coding region is only 45%, indicated
under the heading "NT protease". The amino terminal domains containing the signal and propeptide
regions were similar in both extent of homology and distribution of consensus sequences, as indicated
under the headings "signal" and "propeptide". The unexpectedly high DNA sequence homology relative to
that of the protein sequences is particularly due to the 81% conservation in the third position of each codon
of the sequence. These investigations, revealing the close homology between sprA and sprB genes,
suggest that both genes originated by duplication of a common ancestral gene. With appropriate care and
investigation, the commonality of the signal peptides can be determined, thus establishing the cue for
secretion of proteins and hence providing sufficient information to construct, from the signal DNA of sprA
and sprB, a single nucleic acid sequence which will be competent to direct protein secretion.
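
By way of illustration only, the two kinds of figure quoted above (homology on the basis of amino acid identity, and conservation of the third position of each codon) may be computed as in the following sketch. The description does not state how gaps in the alignment were counted; skipping gapped columns is an assumption of this sketch, and the toy sequences are invented for illustration and are not taken from Figures 4 to 6.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap columns holding the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def third_position_conservation(cds_a, cds_b):
    """Percent of codons whose third base is identical in two in-frame,
    equal-length coding sequences."""
    if not cds_a or len(cds_a) != len(cds_b) or len(cds_a) % 3:
        raise ValueError("need non-empty, equal-length sequences of whole codons")
    thirds_a, thirds_b = cds_a[2::3], cds_b[2::3]
    same = sum(x == y for x, y in zip(thirds_a, thirds_b))
    return 100.0 * same / len(thirds_a)


# Toy inputs, invented for illustration only.
toy_identity = percent_identity("MKV-LAG", "MRVQLSG")
toy_third = third_position_conservation("GGCGACTCC", "GGCGAGAGC")
```

Applying the first function separately to the signal, propeptide, amino-terminal and carboxy-terminal segments of an alignment would give per-region figures of the kind summarised under the Table I headings.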

In accordance with the invention, a recombinant DNA sequence can be developed which encodes for
desired protein where the expressed protein, in conjunction with the signal peptide and optionally the
propeptide, provides for secretion of the desired protein in bioactive form. The recombinant DNA sequence
may be inserted in a suitable vector for transforming a desired cell for manufacturing the protein. Suitable
expression vectors may include plasmids and viral phages. As is appreciated by those skilled in the art, the
bioactivity of secretory proteins is assured by establishing the correct configuration of intramolecular
disulphide bonds. Thus, suitable prokaryotic hosts may be selected for their ability to display enzymatic
activity of a type typified by, but not limited to, that of protein disulphide oxidoreductase, EC 5.3.4.1.

The particular protein encoded by the recombinant DNA sequence may include eukaryotic secretory
enzymes, such as prochymosin, chymotrypsins, trypsins, amylases, ligninases, chymosin, elastases, lipases,
and cellulases; prokaryotic secretory enzymes, such as glucose isomerase, amylases, lipases, pectinases,
cellulases, proteinases, oxidases, ligninases; blood factors, such as Factor VIII and Factor IX and Factor VIII-
related biosynthetic blood coagulant proteins; tissue-type plasminogen activator; hormones, such as
proinsulin; lymphokines, such as beta and gamma-interferon, and interleukin-2; enzyme inhibitors, such as
extracellular proteins whose action is to destroy antibiotics either enzymatically or by binding, for example,
a β-lactamase inhibitor, α-trypsin inhibitor; growth factors, such as organism or nerve growth factors,
epidermal growth factors, tumor necrosis factors, colony stimulating factors; immunoglobulin-related
molecules, such as synthetic, designed, or engineered antibody molecules; cell receptors, such as cholesterol
receptor; viral molecules, such as viral hemagglutinins, AIDS antigen and immunogen, hepatitis B antigen
and immunogen, foot-and-mouth disease virus antigen and immunogen; bacterial surface effectors, such as
protein A; toxins such as protein insecticides, algicides, fungicides, and biocides; and systemic proteins of
medical importance, such as myocardial infarct protein (MIP), weight control factor (WCF), caloric rate
protein (CRP) and hirudin (HRD).

One skilled in the art can easily determine whether the use of any known or unknown organism will be
within the scope of this invention in accordance with the above discussion and the following examples.

Microorganisms which may be useful in this respect as potential prokaryotic expression hosts include:
Order:

Actinomycetales; Family: Actinomycetaceae; Genus: Matruchonema, Lactophera; Family Actinobacteria:
Genus Actinomyces, Agromyces, Arachnia, Arcanobacterium, Arthrobacter, Brevibacterium, Cellulomonas,
Curtobacterium, Microbacterium, Oerskovia, Promicromonospora, Renibacterium, Rothia; Family
Actinoplanetes: Genus Actinoplanes, Dactylosporangium, Micromonospora; Family Nocardioform
actinomycetes: Genus Caseobacter, Corynebacterium, Mycobacterium, Nocardia, Rhodococcus; Family Strep...
...lospora, Microspora, Planospora, Spirillospora, Streptosporangium; Family Thermospora: Genus
Actinosynnema, Nocardiopsis, Thermophilla; Family Microspora: Genus Actinospora, Saccharospora; Family
Thermoactinomycetes: Genus Thermoactinomyces; and the other prokaryotic genera: Acetivibrio, Acetobacter,
Achromobacter, Acinetobacter, Aeromonas, Bacterionema, Bifidobacterium, Flavobacterium, Kurthia,
Lactobacillus, Leuconostoc, Mycobacteria, Propionibacterium.

The following species from the genus Streptomyces are identified as particularly suitable as hosts:
..., albus, amylolyticus, argenteolus, aureofaciens, aureus, candidus, cellostaticus, cellulolyticus,
coelicolor, creamorus, diastaticus, farinosus, flaveolus, flavogriseus, fradiae, fulvoviridis, fungicidicus,
..., globisporus, griseolus, griseus, hygroscopicus, ligninolyticus, lipolyticus, lividans,
..., parvus, ..., and ..., acrimycini, ..., antibioticus, diastatochromogenes, ..., fendae, griseofuscus,
koganeiensis, parvulus, ...



Also, the following eukaryotic hosts are potentially useful in the practice of this invention:
Absidia, Acremonium, Acrophialophora, Acrospeira, Alternaria, Arthrobotrys, Ascotricha, Aureobasidium,
..., Chaetomium, Chrysosporium, Circinella, Cladosporium, ..., Coccospora, Cochliobolus,
Cunninghamella, Curvularia, Custingophora, Dacrymyces, Dacryopinax, Dendryphion, Dictyosporium,
Doratomyces, ..., Monodictys, Monosporium, Morchella, Mortierella, Mucor, Myceliophthora, Myrothecium,
Neurospora, ..., Phaeocoriolellus, Phanerochaete, Phialophora, Piptocephalis, Pleurotus, Preussia,
Pycnoporus, Rhinocladiella, Rhizomucor, Rhizopus, Rhodotorula, Robillarda, Saccharomyces,
Schwanniomyces, Scolecobasidium, Scopulariopsis, Scytalidium, Stachybotrys, Tetracladium, Thamnidium,
Thermoascus, Thermomyces, ..., Torula, Torulopsis, Trametes, ..., Trichocladium,
Trichoderma, Trichurus, Truncatella, Ulocladium, Ustilago, Verticillium, Wardomyces, Xylogone, Yarrowia.
Preferred embodiments of the invention are exemplified in the following procedures. Such procedures
and results are by way of example and are not intended to be in any way limiting to the scope of the claims.

Strains and Plasmids

Streptomyces griseus (ATCC 15395) was obtained from the American Type Culture Collection.
Streptomyces lividans 66 (Bibb, M.J., J.L. Schottel, and S.N. Cohen (1980), A DNA cloning system for
interspecies gene transfer in antibiotic-producing Streptomyces, Nature 284:526-531) and the plasmids pIJ61
and pIJ702 were from the John Innes Institute (Thompson, C.J., T. Kieser, J.M. Ward, and D.A. Hopwood (1982),
Physical analysis of antibiotic-resistance genes from Streptomyces and their use in vector construction,
Gene 20:51-62; Katz, E., C.J. Thompson, and D.A. Hopwood (1983), Cloning and expression of the
tyrosinase gene from Streptomyces antibioticus in Streptomyces lividans, J. Gen. Microbiol., 129:2703-
2714). Plasmids pUC8, pUC18 and pUC19 were purchased from Bethesda Research Laboratories.

Growth of Streptomyces mycelium for the isolation of DNA or the preparation of protoplasts was as
described in Hopwood, D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P.
Smith, J.M. Ward, and H. Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual,
The John Innes Foundation, Norwich, UK. Protoplasts of S. lividans were prepared by lysozyme treatment,
transformed with plasmid DNA, and selected for resistance to thiostrepton, as described in Hopwood, D.A.,
M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H.
Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation,
Norwich, UK. Transformants were screened for proteolytic or amylolytic activity on LB plates containing 30
µg/ml thiostrepton, and either 1% skim milk or 1% corn starch, respectively. E. coli transformants were
grown on YT medium containing 50 µg/ml ampicillin.

Materials

Oligonucleotides were synthesized using an Applied Biosystems 380A DNA synthesizer. Columns,
phosphoramidites, and reagents used for oligonucleotide synthesis were obtained from Applied Biosystems,
Inc. through Technical Marketing Associates. Oligonucleotides were purified by polyacrylamide gel
electrophoresis followed by DEAE cellulose chromatography. Enzymes for digesting and modifying DNA were
purchased from New England Biolabs and used according to the supplier's recommendations.
Radioisotopes [α-32P]dATP (~3000 Ci/mmol) and [γ-32P]ATP (~3000 Ci/mmol) were from Amersham.
Thiostrepton was donated by Squibb.

EXAMPLE 1 - Isolation of DNA

Chromosomal DNA was isolated from Streptomyces as described in Chater, K.F., D.A. Hopwood, T.
Kieser, and C.J. Thompson (1982), Gene cloning in Streptomyces, Curr. Topics Microbiol. Immunol.,
96:69-95, except that sodium dodecyl sarcosinate (final conc. 0.5%) was substituted for sodium dodecyl
sulfate. Plasmid DNA of transformed S. lividans was prepared by an alkaline lysis procedure as set out in
Hopwood, D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M.
Ward, and H. Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes
Foundation, Norwich, UK. Plasmid DNA from E. coli was purified by a rapid boiling method (Holmes, D.S.,
and M. Quigley (1981), A rapid boiling method for the preparation of bacterial plasmids, Anal. Biochem.,
114:193-197). DNA fragments and vectors used for all constructions were separated by electrophoresis on
low melting point agarose, and purified from the molten agarose by phenol extraction.

EXAMPLE 2 - Construction of Genomic Library

Chromosomal DNA of S. griseus ATCC 15395 was digested to completion with BamHI and fractionated
by electrophoresis on a 0.8% low melting point agarose gel. DNA fragments ranging in size from 4 to 12
kilobase pairs (kb) were isolated from the agarose gel. The plasmid vectors pUC18 and pUC19 were
digested with BamHI and treated with calf intestinal alkaline phosphatase (Boehringer Mannheim). The S.
griseus BamHI fragments (0.3 µg) and vectors (0.8 µg) were ligated in a final volume of 20 µl as described
in Maniatis, T., E.F. Fritsch, and J. Sambrook (1982), Molecular Cloning, A Laboratory Manual (Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY). Approximately 8000 transformants of HB101 were obtained
from each ligation reaction.
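
The complete digestion and size selection described in this example can be mimicked in silico. The following is a minimal sketch, not part of the original procedure: the function names `digest` and `size_select` are hypothetical, the test sequence is a toy string rather than S. griseus DNA, and the size window is passed as a parameter so that any range can be tried. The only enzymological fact it relies on is that BamHI cleaves G^GATCC.

```python
def digest(seq, site="GGATCC", cut=1):
    """Cut a linear sequence at every occurrence of `site`.

    `cut` is the offset inside the site where the top strand is cleaved
    (1 for BamHI, G^GATCC). Fragments are returned in their original order.
    """
    fragments, start, pos = [], 0, seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut])
        start = pos + cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length lies within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy example (hypothetical sequence): two BamHI sites give three fragments.
toy = "AAAA" + "GGATCC" + "T" * 10 + "GGATCC" + "CC"
toy_fragments = digest(toy)
toy_selected = size_select(toy_fragments, 6, 20)
```

In this toy case the digest yields fragments of 5, 16 and 7 bases, and a window of 6 to 20 bases retains the last two. For a real library the same logic would be applied with the window expressed in base pairs.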

EXAMPLE 3 - Subcloning of Protease Gene Fragments

A hybrid Streptomyces-E. coli vector was constructed by ligating pIJ702, which had been linearized by
BamHI, into the BamHI site of pUC8. The unique BglII site of this vector was used for subcloning BamHI
and BglII fragments of the protease genes. Other fragments were adapted with BamHI linkers to facilitate
ligation into the BglII site. The hybrid vector, with pUC8 inserted at the BamHI site of pIJ702, was incapable
of replicating in Streptomyces. However, the E. coli plasmid could be readily removed prior to transforming
S. lividans by digestion with BamHI followed by recircularization with T4 ligase.
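
BamHI fragments can be ligated into a BglII site because both enzymes leave the same four-base 5' overhang. The sketch below illustrates this using the well-known recognition sites G^GATCC (BamHI), A^GATCT (BglII) and ^GATC (Sau3AI); the helper names are hypothetical and the code is an illustration, not a description of the authors' procedure.

```python
# Recognition site and top-strand cut offset for each enzyme.
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "BglII": ("AGATCT", 1),   # A^GATCT
    "Sau3AI": ("GATC", 0),    # ^GATC
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(site, cut):
    """Single-stranded 5' overhang left by a palindromic site.

    The bottom strand is cut at the mirror position, so the overhang is
    the stretch between the two cuts: site[cut:len(site) - cut].
    """
    assert site == reverse_complement(site), "site must be palindromic"
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """True if the two enzymes produce identical cohesive ends."""
    return (five_prime_overhang(*ENZYMES[enzyme_a])
            == five_prime_overhang(*ENZYMES[enzyme_b]))


def hybrid_junction(left, right):
    """Top-strand sequence formed when a `left` end is ligated to a `right` end."""
    site_l, cut_l = ENZYMES[left]
    site_r, cut_r = ENZYMES[right]
    return site_l[:cut_l] + site_r[cut_r:]
```

All three enzymes give a GATC overhang, so `compatible("BamHI", "BglII")` is true. The junction produced, `hybrid_junction("BamHI", "BglII")`, is GGATCT, which is neither a BamHI nor a BglII site, although it still contains the Sau3AI site GATC.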


EXAMPLE 4 - Construction for Testing the sprB Signal Peptide

The 0.4 kb Sau3AI-NcoI fragment, containing the aminoglycoside phosphotransferase gene promoter,
was isolated from pIJ61 and subcloned into the BamHI and NcoI sites of a suitable vector. The NcoI site
containing the initiator ATG was joined to the MluI site of the sprB signal using two 43-mer oligonucleotides,
which reconstructed the amino-terminus of the signal peptide. An amylase gene of S. griseus was adapted
by ligating a 14-mer PstI linker to a SmaI site in the third codon. This removed the signal peptide and
restored the amino-terminus of the mature amylase. The HaeII site of the sprB signal was joined to the PstI
site of the amylase subclone using two 28-mer oligonucleotides, which reconstructed the carboxy-terminus
of the signal peptide.

EXAMPLE 5 - Hybridization

A 20-mer (5'TTCCC(C/G)AACAACGACTACGG3') oligonucleotide was designed from an amino acid
sequence (FPNNDYG) which was common to both proteases. For use as a hybridization probe, the
oligonucleotide was end-labelled using T4 polynucleotide kinase (New England Biolabs) and [γ-32P]ATP.
Digested genomic or plasmid DNA was transferred to a Hybond-N nylon membrane (Amersham) by
electroblotting and hybridized in the presence of formamide (80%) as described in Hopwood, D.A., M.J.
Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H. Schrempf
(1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation, Norwich,
UK. The filters were hybridized with the labelled oligonucleotide probe at 30 °C for 18 h, and washed at
47 °C. The S. griseus genomic library was screened by colony hybridization as described in Wallace, R.B.,
M.J. Johnson, T. Hirose, T. Miyake, E.H. Kawashima, and K. Itakura (1981), The use of synthetic
oligonucleotides as hybridization probes. II. Hybridization of oligonucleotides of mixed sequence to rabbit
β-globin DNA, Nucleic Acids Res., 9:879-894.
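
The relationship between the mixed-sequence probe and the peptide it was designed from can be checked with a short script. This is a sketch under stated assumptions: the probe is taken as 5'-TTCCC(C/G)AACAACGACTACGG-3' with the mixed position written as the IUPAC code S, the codon table lists only the standard-code codons this probe needs, and the function names are hypothetical.

```python
from itertools import product

# Probe with the (C/G) position written as IUPAC "S" (assumed reading).
PROBE = "TTCCCSAACAACGACTACGG"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG"}

# Subset of the standard genetic code covering the codons in this probe.
CODONS = {"TTC": "F", "CCC": "P", "CCG": "P",
          "AAC": "N", "GAC": "D", "TAC": "Y"}


def expand(probe):
    """List every concrete sequence represented by a degenerate probe."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in probe))]


def translate(seq):
    """Translate complete codons from position 0; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, usable, 3))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


variants = expand(PROBE)
peptides = {translate(v) for v in variants}
```

Under these assumptions the probe is a mixture of two 20-mers that differ only in the third base of the proline codon, and both translate to FPNNDY. The final two bases, GG, are the start of the glycine codon (GGN), which is why the probe covers FPNNDYG with only one mixed position.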

EXAMPLE 6 - DNA Sequencing

The sequences of sprA and sprB were determined using a combination of the chemical cleavage
sequencing method (Maxam, A., and W. Gilbert (1977), A new method for sequencing DNA, Proc. Natl.
Acad. Sci. U.S.A., 74:560-564) and the dideoxy sequencing method (Sanger, F., S. Nicklen, and A.R.
Coulson (1977), DNA sequencing with chain terminating inhibitors, Proc. Natl. Acad. Sci. U.S.A.,
74:5463-5467). Restriction fragments were end-labeled using either polynucleotide kinase or the large
fragment of DNA Polymerase I (Amersham), with the appropriate radiolabeled nucleoside triphosphate.
Labeled fragments were either digested with a second restriction endonuclease or strand-separated,
followed by electroelution from a polyacrylamide gel. Subclones were prepared in the M13 bacteriophage
and the dideoxy sequencing reactions were run using the -20 universal primer (New England Biolabs). In
some areas of strong secondary structure, compressions and polymerase failure necessitated the use of
either inosine (Mills, D.R., and F.R. Kramer (1979), Structure independent nucleotide sequence analysis,
Proc. Natl. Acad. Sci. U.S.A., 76:2232-2235) or 7-deazaguanosine (Mizusawa, S., S. Nishimura, and F. Seela
(1986), Improvement of the dideoxy chain termination method of DNA sequencing by use of
deoxy-7-deazaguanosine triphosphate in place of dGTP, Nucleic Acids Res., 14:1319-1324) analogs in the
dideoxy reactions to clarify the sequence. The sequences were compiled using the software of DNASTAR
(Doggette, P.E., and F.R. Blattner (1986), Personal access of sequence databases on personal computers,
Nucleic Acids Res., 14:611-616).

Claims

Claims for the following Contracting States : AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. The DNA signal sequence of Fig. 4A.

2. The DNA signal sequence of Fig. 5A.

3. A vector comprising the signal sequence defined in claim 1 or claim 2 and also a sequence encoding a
desired protein fused thereto.

4. The vector of claim 3, which is a plasmid or phage.

5. A transformed prokaryotic cell comprising the vector of claim 3 or claim 4, which is capable of
expressing said sequences, as a fusion protein.

6. The cell of claim 5, which is of the genus Streptomyces.

7. The cell of claim 6, which is S. lividans or S. griseus.

8. A method for preparing a desired protein, which comprises culturing the cell of any of claims 5 to 7 in a
nutrient medium, the fusion protein being produced as an intermediate and the signal sequence
directing secretion of the desired protein from the cell.

Claims for the following Contracting State : ES

1. A process for preparing a DNA vector, comprising introducing the signal sequence of Fig. 4A or Fig. 5A
and also a sequence encoding a desired protein fused thereto.

2. The process of claim 1, wherein the vector is a plasmid or phage.

3. A process for preparing a transformed prokaryotic cell, comprising transformation with the vector of
claim 1 or claim 2, whereby the cell is capable of expressing said sequences, as a fusion protein.

4. The process of claim 3, wherein the cell is of the genus Streptomyces.

5. The process of claim 4, wherein the cell is S. lividans or S. griseus.

6. A method for preparing a desired protein, which comprises culturing the cell of any of claims 3 to 5 in a
nutrient medium, the fusion protein being produced as an intermediate and the signal sequence
directing secretion of the desired protein from the cell.

Patentansprüche (Claims, translated from the German text)

Claims for the following Contracting States : AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. DNA signal sequence of Fig. 4A.

2. DNA signal sequence of Fig. 5A.

3. Vector comprising the signal sequence defined in claim 1 or claim 2 together with a sequence linked
thereto which codes for a desired protein.

4. Vector according to claim 3, which is a plasmid or phage.

5. Transformed prokaryotic cell comprising the vector according to claim 3 or claim 4, which is capable of
expressing said sequences as a fusion protein.

6. Cell according to claim 5, of the genus Streptomyces.

7. Cell according to claim 6, which is S. lividans or S. griseus.

8. Method for producing a desired protein, which comprises culturing the cell according to any one of
claims 5 to 7 in a nutrient medium, the fusion protein being produced as an intermediate and the
signal sequence directing the secretion of the desired protein from the cell.

Claims for the following Contracting State : ES

1. Process for producing a DNA vector, which comprises introducing the signal sequence of Fig. 4A
or Fig. 5A together with a sequence linked thereto which codes for a desired protein.

2. Process according to claim 1, wherein the vector is a plasmid or phage.

3. Process for producing a transformed prokaryotic cell, comprising transformation
with the vector according to claim 1 or claim 2, whereby the cell is capable of expressing said
sequences as a fusion protein.

4. Process according to claim 3, wherein the cell is of the genus Streptomyces.

5. Process according to claim 4, wherein the cell is S. lividans or S. griseus.

6. Method for producing a desired protein, which comprises culturing the cell according to any one of
claims 3 to 5 in a nutrient medium, the fusion protein being produced as an intermediate and the
signal sequence directing the secretion of the desired protein from the cell.

Revendications (Claims, translated from the French text)

Claims for the following Contracting States : AT, BE, CH, DE, FR, GB, GR, IT, LI, LU, NL, SE

1. DNA signal sequence of Figure 4A.

2. DNA signal sequence of Figure 5A.

3. Vector comprising the signal sequence defined in claim 1 or claim 2, together with,
fused to this vector, a sequence coding for a desired protein.

4. Vector according to claim 3, which is a plasmid or a phage.

5. Transformed prokaryotic cell comprising the vector according to claim 3 or claim 4,
capable of expressing said sequences in the form of a fused protein.

6. Cell according to claim 5, of the genus Streptomyces.

7. Cell according to claim 6, which is S. lividans or S. griseus.

8. Process for preparing a desired protein, which comprises culturing the cell according to
any one of claims 5 to 7 in a nutrient medium, the fused protein being produced
as an intermediate, and the signal sequence directing the secretion of the desired protein out of
the cell.

Claims for the following Contracting State : ES

1. Process for preparing a DNA vector comprising the introduction of the signal sequence of Figure
4A or of Figure 5A, together with, fused to this vector, a sequence coding for a desired
protein.

2. Process according to claim 1, in which the vector is a plasmid or a phage.

3. Process for preparing a transformed prokaryotic cell, comprising transformation with the
vector according to claim 1 or claim 2, by virtue of which the cell is capable of expressing
said sequences in the form of a fused protein.

4. Process according to claim 3, in which the cell is of the genus Streptomyces.

5. Process according to claim 4, in which the cell is S. lividans or S. griseus.

6. Process for preparing a desired protein, which comprises culturing the cell according to
any one of claims 3 to 5 in a nutrient medium, the fused protein being produced
as an intermediate, and the signal sequence directing the secretion of the desired protein out of
the cell.
FIG.3.
STC^CCeCCATCTCATTCCSGaTC^

CAC€GTGCMCTTG«XA6GT?CA£CCeA€TCa^^ 

( _ ..... HTfKSSfSPLSSTSS 

M^C836CC6TCCCCCCATACCTCG^S6AtcrCsr!»CCTTCMSCGCTTCTCSCCGCTCA0CASCAC6TCMa 

YARtlAVASStVAAAAlATPS&y&A 
ATATfiC ACSGCTCCTCGeCGTGGCC TCC SSCC TGSrQGCCSCCGCSGCCC TQGCCACCCCCTCfiCCGTCSCCSC 

-80 

^EAESKATVSQLADASSAILAftOVA 
TCeCaSGCGC&GTCCMGCCACCSm^ 

STAW¥T£AST6KlYtTA0STY$RAE 
GG5CACCGCCTGfirACACGGA6GCGACCAC56GCAAGATC6TCCTCACCGCCGACASCACCGTGTCGAA6GCCGA 
-20 

IHHHlASSSAKlIHUHSItf? 
AC T5GG C AAGGT C AX AAC SCGC T 6SCSGGCTCC AAG6CGAAAC TGACG6TC AASC6C GCCGA66SCAAGT TCA€ 

20 

?llAGSSAITTSOSRCSLSF«SfSVH 
C€CGCTSArC6C(^G<X;&A.SXCArCACCACC^ 

40 

8VAKAlTA3HCTK!SASWSt<sTRT5 
C^CeTCSCCCACKCOCACCXCS^ 

60 

JRilSHP AAAOGRVVl 
AACCAGCTTCCCGAACMC6ACTACGSCATCATCC6CCACTCGMCCCS^G6C3<3CCGACa6CCSSGICTftCCT 
SO 

rfJSSrQOiTTASKAfVGQAVQRSSS 
QTAC AAC 6GCTCCTACC A GGAC ATC AC SAC SGC&GGC AAC GCCTTT GTGG6GC AGGCC 6TCCAGC GCA6CGGCAG 

TTSlSSG$¥TGLNATV8Y«$$Gf¥Y 
CA€CACC«m«GCA«:GS€TCGSTCACCGSeC^^ 

540 

S«SQT8¥CA£?G£)SSSSLFA6STAL 
C<XXATGATCCA(»CCAAe6TCTGTGCCltf<^ 

?6G 

SLT$6SS6«C8TSfirTf1fQf»VTCAI. 
GSGTCICACCTCCGGCGGCAGTG^MCTGCCGG^CSSCGGCACCACSTTCTAC^ 
18t 

S A Y G A r V L * , "41-0 

^StSCCTACGSGC^CGGTCCT&TAX^ 

r ^o. a 

CCCCSCGCGACGCCCCACCCCSGCGGACCGTSCTCGCGCXGGTCCGCCCTCGCCGTGCCACGAACCCCACCGTC 

aTTCeeeGTCAGGCStCT{£CSCTC^ 

r^GTCCTGCCCTC^CACGGTCCSGTTC^ 

CMC€CCGTTSmS£6^7GASGTCGCGA1^ 

TC8AC 

FIG.4.
FIG.4A.







BP 0 300 4S6 81 



FIG.4B. 




?£AE$KATV$Qi.ADASSAJlAAOVA 
TCCCGAGGCGGAGtCCAAQGCCACCGTTTCGCAGCTCSCC&ACGCCA6CTCC0CCATCCTCfiCCGCrGAT5T66C 

-40 

STAW*T£A$TGKIYl?AOSTVSKA£ 
aSCCACCi&CTG&TACAC^ 



LAKVSHAlASSRAfCllVfCftAESKFT 
ACTSSCCAASSTCAGCMC&CGCT6GCGaGC1CCMG6CSAAACTGACGQTCAAGCGCSCCGAGSGCAAGTtCAC
FIG.4C.




GVAHAITASHCTN I SASWS I 6TRT0 
CGC^STCGCCCAC(XCCTCACC8CCSGCCACTGCACCAACATCA6CGCCAaTa6TCCATC36CAC0CSCACCQS 

60 

I $ F M » 8 f fi I IRHSttPAAAOCftVYL 
A#CCASaTCCC6AACMCGACTACG(XATCATCCGCeACTC^ 



80 

f M 5 ? (j 0 1 TTA6NAFV6QAVQ8SSS 
6?ACAAC6fiCTCCTACCASGACATCAC6AC6GCGGGCAACGCCTTTGTGGGGCA6CCCGtCCAGCGC*6C6GCA6 
100 520 
TT$tS$6$¥T6lHAT¥NYS$S$J y V 
CAC€ACC6G(KTGC8CASC0^TCSSTCACCC«CTCMC(XCACaGTCAACtAC58TTCCAGCGSGATCStSIA 

140 

S«iQTMVCA£PG0S66SlFAGSTAL 
C5GCATSATCCA6ACCAACGTCTGTGCCGA(XCCaGTGAeAGT£GAG{KTCSCTCTTCGCGGGCA6CACCSCTCT 

isa 

6tTS6SS6»C8TSS77F»QPvrEAU 
G6GTCTCACCTCCGGC(»CA6T(»CAACTGCCGGACC5GCG5CACCACGTTCTACCAGCCC6TCACCGA6GC&CT 




B



COeTTTrQaGGaGGAGCSMACCSSATCS^SCCT^CCCSCaCASGCCCCACCeGCCCCGGCAaCCGCACaS 



CTCCCOG^C&STG^CGGATGTGACCCKGTGGCCWftSCATICTTGCGTCCCCCGTCCGGCCCCCTCSATA 
CTCCfiGTCAGCSATTGTCAG^KACGGCfiAATTCSAAATCCGG^CASGCCCCCGACJSCSCCTCACGGKCCGC 

CAeCCCACAGSASC^CCCC^raeCCTCG^^ 

AARR¥RTTA¥lA$tAA¥AALA¥PTA 
CGCSGCSAffiCGCaTCCGCACCACCGCCGTACTCGCS^CCTC^CSCCSTCGCG^GCT'SSCCGTTCCCACCSC 

X A t TPRTfSANqi TAASDAVLSAQi 
SAACGCCSAAACCCCCCS&ACGTTCAST^ 

ASTAWHI0?qSKRlV»rV0STVS<A 
CGCSGGCACCGCCTGGAACATCfiACCCGCAGTCCAAGCGCCTCG'TCGTCACCGTCGACAGCACGGTCTCGMOGC 

E1XQIKKSA6AHA0AIRIERTPSKF 
GMWlCAACCAaTCMaASTCGSCGGG^^ 

J « U $ H H H $ U G II C 5 U f M H 
CACCMSCTGATCTCCGGCGr^GAC6CGATCTACTCCAGCACC56ACK;TG£TC«;TCGG€nCAACGTCCSCA<5 

SSTVYFlTASHCTOCATTHWAtlSAB 
C0GCAGCACCTACTACTTCCTSACCGCCGGCCACTGCACGGACGGCGCGACCACCTGGT66GC6AACTCGGCCCG 

<6o 

TT\fLSTrSQSSff>«KDY«iVRrT8T 
CACC AC GSTSC TCSGCAC 6AC C TCCGGGT C GAGC TTCCCGAACAAC GACTAC GGC ATC GTSCGC T AC ACC AAC AC 
SO 

T f PK&GTVSGQOI TSAAfJATYGHAV 
C'ACCATTCCCAAGSACGGCACG^TCGSCGKCAGGACATCACCASCGCCGCCMCGCCACCSTCGGCATGGCGST 

T88aSTT6TH$6SVTAL»AT¥»ysS 
CACCeGeCI&e^CTCCACeACCC^ACeCAa^ 

GGVVYGM1 RTi<YCAS?GD$GGPLY$ 
CG^^CGTCGtCTACC^ATSATCCSCACCMCGTGTSCGCSGAGCCCGCCGACTCC 

STRAISlTSGSSSKCSSSSTTPFqp 
CGGCACCCG8SCGATC^TCTGACCTCCGGC8SCA(KSeCMCTCCTCCTCCGGCSGCACGACCTTCTTC 

YTEAISAY&VSVY* 
GSICACCSAGSCGCTS^^GTACSGCGTCAGC^ 



CGTACAWGTSCCCCCSTCCGGAm^^ 
ACGACG3GTCGCCGCTGCSC6TC
FIG.5A.
FIG.5B.
FIG.5C.




8STYY?LTASHCT06A TrwwANSAR 
CGGCASaCCTACTACTTCCTGACC«C&KCACTGCACeGACG0CGCGACCACCTS6TGGGCGMCTCG6CCC8 



6c 

TTVlSTTSSSSFfSNDYSiVftYTST 

CACCACGGT(KTC(^ACWCfCC6CGTCWaTTCCC(^CAACGACTACGGCATC6TGCGCTACACCAACAC 
SO 

TiPKOGTVSGQDlTSAAHATVGHAV 
CACCATTCCGAA«6AC<>eC^G6IG68C 

iOO jJO 

T8RGSTT8T«$6S»TALIIATy»Y66 
CACCCCCSCS6CTCCACCACCGGCACCCACA6CS6TTC6GTCACCa:ACTCAACa;CAeC^TCAA€TAC^SSS 

*«*© 

S D V V y G M I RTHVCAEPGOSGGPIYS 
€GSCSACGItSTCTAC6SCATGATCCGCACCMCSt6^ 

!f>0 

GT8A IGlTSSGSGNCSSGGTTFFQP 
CSGCAC€CGGGCGATCGGTCTGACCTCC&6CGGCAi&^
FIG.6A.

A

SprA MTFKRFSPLSSTSRYARLLAVASGLVAAAALATPSAVA
     M  KR S  S   R  R  AV  GL A AALA P A A
SprB MRIKRTSNRSNAARRVRTTAVLAGLAAVAALAVPTANA

FIG.6B.

B 

SprA APEAESKATVSGIADASSA! IAA0VAGT AWYTEASTOCJ 

A Ql AS A L AO AGTAW 
sprS ETPRTFSAN- -OLTAASOAVlGAOIAGTAWMIDPQSm 

sprA VlTAOSTVStCAEtAKVSKAtAGSXAlC- tTVKRAEGKFTPt 
V T DSTVSKAE AG A t R GKFT t 

SprB VVtVDSTVSKAE I NG1 KKS* AGANADAtRJ ERTPGKf TKl 

FIG.6C.

C 

SprA 1 AGGEAlTTGGSRCStGFNVSVNGVAHAlTAGHCTNIS 

I GG A! RCStGFHV LTAGHCT 

SprB { SGGDAl YSSTGRCSLGf NVRSGSTYYFITAGHCTOGA 

SprA ASWS I GTRTGTSFPNNOYGl IRHSNPAAA* 

W GT G SFPNNOYGI R H 

sprS TTWWANSARTtVtGTTSGSSFPNNOYGI VRYTNTT JPK 

sprA DGRVYLYNGSYQDITTAGMAFVGOAVQRSGSTTGLRSG 
DG V G QD1T A HA VG AV R GSTTG SG 
SprS DGTV GG-QDITSAANATVGMAVTRRGSTTGTHSG 

SprA SVTGlNATVNYGSSGl VYGMf QTNVCAEPGDSGGSIFA 

SVT LNATVNYS VYGMI TNVCAEPGDSGG t 
SprS SVTAINATVNYGGGDVVYGHIRTNVCAEPGOSGGPLYS 

SprA GSTAlGLTSGGSGNCRTGGTTFYQPVTEAl SAY GAT VI 

G A GLTSGGS6KC GGTYF GPVTEAtSAYG V 
SprS GTRA I GL TSGGSGNCSSGGTTFFQPVTEAtSAYGVSVY